Abstract

Linear wavelet density estimators are wavelet projections of the empirical measure based
on independent, identically distributed observations. We study here the law of the iterated
logarithm (LIL) and a Berry-Esseen type theorem. These results are proved under
assumptions on the density $f$ that are different from those needed for similar
results in the case of convolution kernels (KDE): whereas the smoothness requirements
are much less stringent than for the KDE, Riemann integrability assumptions are needed
in order to compute the asymptotic variance, which gives the scaling constant in the LIL. To
study the Berry-Esseen type theorem, a rate of convergence result in the martingale CLT
is used.

1. Introduction

Let $X, X_1, X_2, \dots$ be i.i.d. random variables in $\mathbb{R}$ with common Lebesgue density $f$. Let
$\phi \in L_2(\mathbb{R})$ be a scaling function and $\psi$ the corresponding wavelet function. Let
$\phi_{0k}(x) := \phi(x - k)$ and $\psi_{jk}(x) := 2^{j/2}\psi(2^j x - k)$. Then $\{\phi_{0k}, \psi_{jk}\}$ forms an
orthonormal system in $L_2(\mathbb{R})$. Every $f \in L_p(\mathbb{R})$ has a formal expansion

\[ f(x) = \sum_k \alpha_{0k}\phi_{0k}(x) + \sum_{j=0}^{\infty}\sum_k \beta_{jk}\psi_{jk}(x). \tag{1.1} \]

The linear wavelet density estimator is defined as

\[ f_n(x) = \sum_k \hat\alpha_{0k}\phi_{0k}(x) + \sum_{j=0}^{j_n-1}\sum_k \hat\beta_{jk}\psi_{jk}(x), \tag{1.2} \]

where $j_n$ is a sequence of integers, and $\hat\alpha_{jk}$ and $\hat\beta_{jk}$ are constructed by the plug-in method. Let
$P_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$ be the empirical measure corresponding to the sample $\{X_i\}_{i=1}^n$, $n \in \mathbb{N}$.
Then

\[ \hat\alpha_{jk} = P_n(\phi_{jk}) = \frac{1}{n}\sum_{i=1}^n 2^{j/2}\phi(2^j X_i - k), \tag{1.3} \]

\[ \hat\beta_{jk} = P_n(\psi_{jk}) = \frac{1}{n}\sum_{i=1}^n 2^{j/2}\psi(2^j X_i - k). \tag{1.4} \]

They are unbiased estimators of $\alpha$ and $\beta$.

The use of this estimator first appeared in Doukhan and León (1990) and Kerkyacharian
and Picard (1992). When $\phi$ satisfies certain properties, i.e., it is bounded and compactly
supported, one may write $f_n(x)$ in a form similar to that of the classical kernel density
estimator:

\[ f_{n,K}(x) := f_n(x) = \frac{2^{j_n}}{n}\sum_{i=1}^n K(2^{j_n}x, 2^{j_n}X_i), \tag{1.5} \]

where the projection kernel $K(x, y)$ is given by

\[ K(x,y) = \sum_k \phi(x-k)\phi(y-k). \tag{1.6} \]

Here $\{2^{-j_n}\}$ plays the role of the bandwidth in classical kernel density estimation, and
the sum in (1.6) is finite for each $x$ and $y$. By Lemma 8.6 of Härdle, Kerkyacharian, Picard and
Tsybakov (HKPT, 1998), $K(x,y)$ is majorized by a convolution kernel $\Phi(x - y)$ such that

\[ |K(x,y)| \le \Phi(x-y), \tag{1.7} \]

where $\Phi : \mathbb{R} \to \mathbb{R}_+$ is a bounded, compactly supported and symmetric function.
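
As a concrete illustration of the kernel form (1.5)-(1.6), the sketch below evaluates the estimator for the Haar scaling function $\phi = 1_{[0,1)}$, for which the projection kernel reduces to an indicator that $x$ and $y$ fall in the same unit cell. The Haar choice, the uniform sample and the function names are assumptions made for the example only; they are not taken from the paper.

```python
import math
import random

def haar_projection_kernel(x, y):
    """K(x, y) = sum_k phi(x - k) phi(y - k) for the Haar scaling function phi = 1_[0,1).

    Exactly one k contributes, and only when x and y lie in the same cell [k, k+1).
    """
    return 1.0 if math.floor(x) == math.floor(y) else 0.0

def wavelet_density_estimate(x, sample, j):
    """Evaluate f_{n,K}(x) = (2^j / n) * sum_i K(2^j x, 2^j X_i), as in (1.5)."""
    scale = 2.0 ** j
    total = sum(haar_projection_kernel(scale * x, scale * xi) for xi in sample)
    return scale * total / len(sample)

random.seed(0)
sample = [random.random() for _ in range(2000)]  # Uniform(0,1) data (hypothetical example)
j = 3                                            # resolution level, bandwidth 2^{-j}

# The Haar estimate is constant on dyadic bins, so midpoints describe it completely.
midpoints = [(k + 0.5) / 2 ** j for k in range(2 ** j)]
values = [wavelet_density_estimate(x, sample, j) for x in midpoints]
mass = sum(values) / 2 ** j                      # integral of the estimate over [0, 1)
print(values, mass)
```

For the Haar kernel the estimator is simply a histogram on dyadic bins of width $2^{-j_n}$, which makes the role of $2^{-j_n}$ as a bandwidth explicit.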

A widely accepted measure of performance of an estimator is its mean integrated
squared error, which is the expected value of the integrated squared error or $L_2$ error
defined by $I_n := \int (f_n(x) - f(x))^2\,dx$ (see, e.g., Bowman 1985). The integrated squared
error $I_n$ constitutes in itself a nice global measure of approximation of the density, and it
is of interest to obtain the asymptotically exact almost sure rate of approximation, in this
measure, of the density by an estimator of interest, often a law of the iterated logarithm.
This was done by Giné and Mason (2004) for kernel density estimators, and it is done
here for wavelet density estimators. We will refer to several results by Giné and Mason
(2004), which will be abbreviated as (GM) in what follows. Theorems of this type may be
thought of as companion results to central limit theorems: whereas the latter give the rate of
approximation in probability, the former deal with the a.s. rate of convergence. The central
limit theorem for the integrated squared error $I_n$ was obtained by Hall (1984) for kernel
density estimators, and by Zhang and Zheng (1999) for wavelet density estimators. We
also prove a Berry-Esseen type theorem as a complement to Zhang and Zheng's result.
Doukhan and León (1993) obtained a bound on the rate of convergence in the CLT for
generalized density projection estimates with respect to Prohorov's metric. However, their
bound does not apply to the optimal window width.

To study the integrated squared error for the wavelet density estimator, we shall impose
the following conditions:

(f): $f(x)$ is bounded.

(S1): The scaling function $\phi$ is bounded and compactly supported (e.g., Daubechies
wavelet).

Then, in (1.7), we can assume $\Phi$ is supported on $[-A, A]$ for some $A > 0$. Set
$\theta_\phi(x) = \sum_k |\phi(x - k)|$. (S1) also guarantees that (see section 8.5, HKPT, 1998)

\[ \operatorname*{ess\,sup}_x \theta_\phi(x) < \infty. \tag{1.8} \]

(S2): $\|\phi\|_v < \infty$, where $\|\cdot\|_v$ denotes the total variation norm of $\phi$.

The bandwidth $\{2^{-j_n}\}$ satisfies

(B1): $j_n \to \infty$, $2^{-j_n} \asymp n^{-\delta}$ for some $\delta \in (0,1/3)$, where $a_n \asymp b_n$ means
$0 < \liminf a_n/b_n \le \limsup a_n/b_n < \infty$.

(B2): There exists an increasing sequence of positive constants $\{\lambda_k\}_{k \ge 1}$ satisfying

\[ \lambda_{k+1}/\lambda_k \to 1, \qquad \log\log\lambda_k/\log k \to 1, \qquad \lambda_{k+1} - \lambda_k \to \infty \tag{1.9} \]

as $k \to \infty$, such that $2^{-j_n}$ is constant for $n \in [\lambda_k, \lambda_{k+1})$, $k \in \mathbb{N}$. For instance, the sequence
$\lambda_k = \exp(k/\log(e + k))$ satisfies these conditions.
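
The three requirements in (1.9) can be checked numerically for the example sequence $\lambda_k = \exp(k/\log(e+k))$. The following is only a sanity-check sketch: it works with $\log\lambda_k$ to avoid floating-point overflow, and the helper names are hypothetical.

```python
import math

def log_lambda(k):
    """log(lambda_k) for lambda_k = exp(k / log(e + k)); logs avoid overflow for large k."""
    return k / math.log(math.e + k)

def ratio(k):
    """lambda_{k+1} / lambda_k, which should tend to 1."""
    return math.exp(log_lambda(k + 1) - log_lambda(k))

def loglog_ratio(k):
    """log log lambda_k / log k, which should tend to 1."""
    return math.log(log_lambda(k)) / math.log(k)

def log_increment(k):
    """log(lambda_{k+1} - lambda_k), computed as log lambda_k + log(ratio - 1)."""
    return log_lambda(k) + math.log(math.expm1(log_lambda(k + 1) - log_lambda(k)))

for k in (10, 10**3, 10**6):
    print(k, ratio(k), loglog_ratio(k), log_increment(k))
```

Since $\log\log\lambda_k = \log k - \log\log(e+k)$, the second ratio approaches 1 only at the slow rate $\log\log k/\log k$, so the printed values are still visibly below 1 even for large $k$.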

We will prove the following theorems for the statistic

\[ J_n := \|f_{n,K} - f\|_2^2 - E\|f_{n,K} - f\|_2^2. \tag{1.10} \]

Theorem 1.1. Let $f$, $\phi$ and $j_n$ satisfy hypotheses (f), (S1), (S2), (B1) and (B2). Set
$\sigma^2 := 2\int_{\mathbb{R}} f^2(x)\,dx$. Then,

\[ \limsup_{n\to\infty} \pm\frac{n 2^{-j_n/2} J_n}{\sigma\sqrt{2\log\log n}} = 1 \quad \text{a.s.} \tag{1.11} \]

Theorem 1.2. Assume the hypotheses (f), (S1), (B1) and that there exists $L > 0$ such
that $f$ is Hölder continuous with exponent $0 < \alpha \le 1$ on $[-L,L]$, and $f$ is monotonically
increasing on $(-\infty, -L]$ and monotonically decreasing on $[L, \infty)$. Let $Z \sim N(0, 1)$. Then
there exists a constant $C$ (depending on $f$, $\phi$ and $\{j_n\}$), such that

\[ \sup_t \big| \Pr\{n 2^{-j_n/2} J_n \le t\} - \Pr\{\sigma Z \le t\} \big| \le C\big(n^{-3\delta/16} \vee n^{-\delta\alpha}\sqrt{\log n}\big), \tag{1.12} \]

where $\sigma^2 = 2\int_{\mathbb{R}} f^2(x)\,dx$.

For example, if $2^{-j_n} \asymp n^{-1/5}$, then
$\sup_t |\Pr\{n 2^{-j_n/2} J_n \le t\} - \Pr\{\sigma Z \le t\}| \le C(n^{-3/80} \vee n^{-\alpha/5}\sqrt{\log n})$.
No claim of optimality of the rate obtained is made.
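
To get a feeling for the bound in (1.12), the sketch below evaluates the rate $n^{-3\delta/16} \vee n^{-\delta\alpha}\sqrt{\log n}$ without the unspecified constant $C$. The function name and the parameter values are illustrative assumptions.

```python
import math

def berry_esseen_rate(n, delta, alpha):
    """Rate in (1.12) without the constant C: max(n^(-3*delta/16), n^(-delta*alpha) * sqrt(log n))."""
    smooth_term = n ** (-3.0 * delta / 16.0)
    holder_term = n ** (-delta * alpha) * math.sqrt(math.log(n))
    return max(smooth_term, holder_term)

# Bandwidth 2^{-j_n} of order n^{-1/5}, i.e. delta = 1/5, for two Hölder exponents.
for alpha in (1.0, 0.1):
    print(alpha, berry_esseen_rate(10**6, 0.2, alpha))
```

With $\delta = 1/5$ the first term is $n^{-3/80}$, in agreement with the example above; for small $\alpha$ the Hölder term $n^{-\alpha/5}\sqrt{\log n}$ takes over.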

Zhang and Zheng (1999) used the fact that $J_n$ coincides with its stochastic part, $\bar J_n$,
where

\[ \bar J_n := \|f_{n,K} - Ef_{n,K}\|_2^2 - E\|f_{n,K} - Ef_{n,K}\|_2^2. \tag{1.13} \]

This is due to the orthogonality of the wavelet basis. We will include a short proof later for
completeness. Thus, there is no need to analyze the bias part and assume more regularity
conditions on the density $f$, as is done in the kernel case (e.g., Hall, 1984; GM, 2004).
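
The orthogonality behind $J_n = \bar J_n$ can be seen in a discretized Haar setting, where $Ef_{n,K}$ is the bin-average projection of $f$ onto functions that are constant on dyadic bins, and $Ef_{n,K} - f_{n,K}$ is itself constant on those bins. The sketch below is a minimal check under these assumptions (Haar basis, a grid on $[0,1)$, and a Beta(2,3) test density chosen arbitrarily); it is not part of the paper's proof.

```python
import random

def bin_average(values, bins):
    """Discrete L2 projection onto functions constant on each of `bins` equal-length blocks."""
    block = len(values) // bins
    projected = []
    for b in range(bins):
        chunk = values[b * block:(b + 1) * block]
        mean = sum(chunk) / block
        projected.extend([mean] * block)
    return projected

grid_size, bins = 1024, 8                      # grid on [0,1); 2^{j_n} = 8 Haar bins
grid = [(i + 0.5) / grid_size for i in range(grid_size)]
f = [12.0 * x * (1.0 - x) ** 2 for x in grid]  # Beta(2,3) density (arbitrary test density)
Pf = bin_average(f, bins)                      # stands in for E f_{n,K}

random.seed(1)
heights = [random.gauss(0.0, 1.0) for _ in range(bins)]
# Any function constant on the bins; stands in for E f_{n,K} - f_{n,K}.
g = [heights[i * bins // grid_size] for i in range(grid_size)]

# Discrete version of the integral of (f - E f_{n,K}) * (E f_{n,K} - f_{n,K}).
inner = sum((fi - pi) * gi for fi, pi, gi in zip(f, Pf, g)) / grid_size
print(inner)
```

The inner product vanishes up to rounding error because $f - Ef_{n,K}$ has zero average on every bin, which is the discrete counterpart of the cross term being zero.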
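Next we set up some notation. Let $K$ be the projection kernel associated with the
scaling function $\phi$ as in (1.6). Set

\[ K_n(t, x) := K(2^{j_n}t, 2^{j_n}x) \quad\text{and}\quad \bar K_n(t, x) := K_n(t, x) - EK_n(t, X). \]

Then by (1.13),

\[ J_n = \frac{2^{2j_n}}{n^2}\left\{ \int_{\mathbb{R}} \Big(\sum_{i=1}^n \bar K_n(t, X_i)\Big)^2 dt - E\int_{\mathbb{R}} \Big(\sum_{i=1}^n \bar K_n(t, X_i)\Big)^2 dt \right\} =: \frac{2^{2j_n}}{n^2} W_n(\mathbb{R}), \tag{1.14} \]

where

\[ W_n(F) := \int_F \Big(\sum_{i=1}^n \bar K_n(t, X_i)\Big)^2 dt - E\int_F \Big(\sum_{i=1}^n \bar K_n(t, X_i)\Big)^2 dt = U_n(F) + L_n(F), \tag{1.15} \]

\[ U_n(F) = \sum_{1 \le i \ne j \le n} \int_F \bar K_n(t, X_i)\bar K_n(t, X_j)\,dt, \qquad L_n(F) = \sum_{i=1}^n \int_F \big(\bar K_n^2(t, X_i) - E\bar K_n^2(t, X)\big)\,dt. \tag{1.16} \]

The measurable set $F$ will normally be $\mathbb{R}$, $[-M, M]$ or $[-M, M]^c$, $M > 0$, with $\int_F f(t)\,dt > 0$.
But in the results below, $F$ can be any set with this property and such that

\[ \lambda\big(\{x + y : x \in F, |y| \le \varepsilon\} \cap F^c\big) \to 0 \quad\text{as } \varepsilon \to 0. \tag{1.17} \]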



The proof of Theorem 1.1 for the most part follows the same pattern as in (GM): For some
$M$ large enough, $W_n([-M, M]^c)$ is shown to be negligible by using an exponential inequality
for degenerate U-statistics (Giné, Latała and Zinn, 2000) and Bernstein's inequality for
the diagonal term. Therefore, we may truncate $J_n$ and deal with $W_n([-M,M])$. This is
approximated by a Gaussian chaos using strong approximations (the Komlós-Major-Tusnády
inequality), and a moderate deviation result is proved for it. Finally, one deals with the usual
blocking of laws of the iterated logarithm. Here it can be implemented again because
of Bernstein type exponential inequalities for U-statistics. However, due to the fact that
$K(x, y)$ is not a convolution kernel, the computation of the limiting variance turns out to
be a major difficulty, which we surmount using ideas from the proof of the CLT in Zhang and
Zheng (1999). For this we require $f$ to be (improper) Riemann integrable on $\mathbb{R}$, and this
is the purpose of condition (f).

In order to get the convergence rate in the CLT, we need to assume more conditions on
$f$. $J_n$ is composed of $L_n(\mathbb{R})$ and $U_n(\mathbb{R})$. The exponential inequality for U-statistics is
used to show that $L_n(\mathbb{R})$ is negligible. Then $U_n(\mathbb{R})$ is approximated by a martingale, and the
rate of convergence is obtained using the result of Erickson, Quine and Weber (1979). The
U-statistics method and the application of the martingale limit theory can be traced back
to Hall (1984). It makes the study of the $L_2$ error easier, but it does not apply to the $L_p$ error if
$p \ne 2$.

The article is organized as follows. In section 2 we collect the variance computation
results. In section 3 we state results of tail estimation. In section 4, we obtain a moderate
deviation result for $W_n([-M,M])$. In section 5, we complete the proofs of Theorem 1.1
and Theorem 1.2. In the appendix, we give proofs of some lemmas stated in section 2. $C$
is a universal constant which might differ from line to line.



2. Variance Computations 

We present here some inequalities and variance computations used throughout the paper.
Only the exact limits present problems and must be treated differently than in the case of
convolution kernels; upper bounds can be dealt with essentially as in the convolution
kernel case because of the majorization property (1.7). We will state these results without
giving detailed proofs. They can be verified by replacing, in the corresponding proofs by
(GM), the bandwidth $h_n$ by $2^{-j_n}$ and the projection kernel $K(x, y)$ by the convolution kernel
$\Phi(x - y)$ given by (1.7). More specifically, if the kernel $K(x,y)$ satisfies (1.7), we
have the following estimates: For all $x$ and $y$, and all measurable sets $F$,

\[ \int_F \bar K_n^2(t,x)\,dt \le 4 \cdot 2^{-j_n}\|\Phi\|_2^2, \tag{2.1} \]

\[ \left| \int_F \bar K_n^2(t,x)\,dt - E\int_F \bar K_n^2(t,X)\,dt \right| \le 8 \cdot 2^{-j_n}\|\Phi\|_2^2, \tag{2.2} \]

and by Cauchy-Schwarz,

\[ \int_F |\bar K_n(t,x)\bar K_n(t,y)|\,dt \le 4 \cdot 2^{-j_n}\|\Phi\|_2^2. \tag{2.3} \]

We have an analogue to Corollary 2.7, (GM).

Corollary 2.1. Assume (f), (S1) and (B1) hold, and that $F$ satisfies condition (1.17).
Then there exists $n_0 = n_0(F)$ such that, for all $n \ge n_0$,

\[ \operatorname{Var}\Big(\int_F \bar K_n^2(t,X)\,dt\Big) \le 8 \cdot 2^{-2j_n}\|\Phi\|_2^4 \int_F f(x)\,dx. \tag{2.4} \]

And for all $n$,

\[ \operatorname{Var}\Big(\int_F \bar K_n^2(t,X)\,dt\Big) \le 4 \cdot 2^{-2j_n}\|\Phi\|_2^4. \tag{2.5} \]

Set

\[ C_n(t,s) := 2^{j_n}\int_{\mathbb{R}} K_n(t,x)K_n(s,x)f(x)\,dx, \qquad R_n(t,s) := 2^{j_n}\int_{\mathbb{R}} \bar K_n(t,x)\bar K_n(s,x)f(x)\,dx. \tag{2.6} \]

Define the operator $\mathcal{R}_{n,F}$ for $\varphi \in L_2(F)$,

\[ \mathcal{R}_{n,F}\varphi(s) = \int_F R_n(s,t)\varphi(t)\,dt. \tag{2.7} \]

The next three lemmas are similar to Lemmas 2.3, 2.4 and 2.5, (GM).
Lemma 2.2. Under the hypotheses of Corollary 2.1, for the operator $\mathcal{R}_{n,F}$, we have



sup{||7Z n ,^|r^ : |M| 2 = l,<pe L 2 {F)} < 2- 2 ^C($,/) 



where 



C($J) = 2\\*\\\ 



+ 



8- 



Lemma 2.3. Under the hypotheses of Corollary 2.1,



lim sup 2 > 



(%(s,t)dsdt < / f 2 {x)dx 



F 2 



$(w + u)$(w)dw ) du 



(2.8) 
(2.9) 

(2.10) 



< jj\x)dx\\mwni 



Lemma 2.4. Under the hypotheses of Corollary 2.1,



(C n (s,t) - R n (s,t)) 2 dsdt < 2~ jn \\m\\f\\i^ as n ->■ oo. 



(2.11) 



Note that in Lemma 2.3, we can only get an upper bound instead of the limit using
the same method as in the convolution kernel case. Calculation of the exact limit of
$2^{j_n}\int_{[-M,M]^2} R_n^2(s,t)\,ds\,dt$ is the key to obtaining the scaling constant in the LIL. By Lemma
2.4, we shall approximate it by $2^{j_n}\int_{[-M,M]^2} C_n^2(s,t)\,ds\,dt$ and calculate the limit of this
quantity.

Lemma 2.5. Assume (f) and (B1) hold, and the scaling function $\phi$ satisfies (S1) such
that the kernel $K$ associated with $\phi$ is dominated by $\Phi$ whose support is contained in
$[-A, A]$, where $A$ is an integer. Then for any $M > 0$,

\[ \lim_{n\to\infty} 2^{j_n}\int_{[-M,M]^2} C_n^2(s,t)\,ds\,dt = \int_{-M}^{M} f^2(y)\,dy. \tag{2.12} \]

In order to prove Theorem 1.2, we need to estimate how fast $2^{j_n}\int_{\mathbb{R}^2} C_n^2(s,t)\,ds\,dt$
converges to $\int_{\mathbb{R}} f^2(y)\,dy$. This can be done by imposing more regularity conditions on $f$.

Lemma 2.6. Assume the hypotheses of Lemma 2.5 and, in addition, that $f$ is
Hölder continuous with exponent $0 < \alpha \le 1$ on $[-L,L]$, and monotone on the tails $(-\infty, -L] \cup
[L, \infty)$, where $L > 0$. Then there exists a constant $C$ (depending on $f$, $\phi$ and
$\{j_n\}$), such that, for all $n$,

\[ \left| 2^{j_n}\int_{\mathbb{R}^2} C_n^2(s,t)\,ds\,dt - \int_{\mathbb{R}} f^2(y)\,dy \right| \le Cn^{-\delta\alpha}, \tag{2.13} \]

where $\delta \in (0, 1/3)$ is the same as in (B1).

Together with Lemma 2.4, we obtain

Corollary 2.7. Assume the same conditions as in Lemma 2.6. For all $n$ sufficiently large
depending on $f$ and $\{j_n\}$,

\[ \left| 2^{j_n}\int_{\mathbb{R}^2} R_n^2(s,t)\,ds\,dt - \int_{\mathbb{R}} f^2(y)\,dy \right| \le C\big(n^{-\delta/2} + n^{-\delta\alpha}\big), \tag{2.14} \]

where the constant $C$ depends on $f$, $\phi$ and $\{j_n\}$.

The proofs of Lemmas 2.5 and 2.6 are provided in the appendix.

3. Tail Estimation

The goal of this section is to obtain exponential inequalities for $W_n(F)$, where $F$ satisfies
(1.17), and also for $W_n(\mathbb{R}) - W_{n,m}(\mathbb{R})$. We assume throughout this section that $\phi$ satisfies
(S1), and that $K$ is associated with $\phi$ as given by (1.6).
Set, for $m < n$,

\[ W_{n,m}(\mathbb{R}) := \int_{\mathbb{R}} \left[ \Big(\sum_{m < i \le n} \bar K_n(t,X_i)\Big)^2 - E\Big(\sum_{m < i \le n} \bar K_n(t,X_i)\Big)^2 \right] dt \tag{3.1} \]

and

\[ H_n(x,y) := \int_{\mathbb{R}} \bar K_n(t,x)\bar K_n(t,y)\,dt, \qquad H_{n,F}(x, y) := \int_F \bar K_n(t,x)\bar K_n(t,y)\,dt. \tag{3.2} \]

With this notation,

\[ U_n(F) = \sum_{1 \le i \ne j \le n} H_{n,F}(X_i,X_j), \qquad L_n(F) = \sum_{i=1}^n \big(H_{n,F}(X_i,X_i) - EH_{n,F}(X_i,X_i)\big), \tag{3.3} \]

and

\[ W_n(\mathbb{R}) - W_{n,m}(\mathbb{R}) = 2\sum_{i=1}^m \sum_{j=m+1}^n H_n(X_i,X_j) + \sum_{1 \le i \ne j \le m} H_n(X_i,X_j) + \sum_{i=1}^m \big(H_n(X_i,X_i) - EH_n(X_i,X_i)\big). \tag{3.4} \]

Bernstein's inequality (e.g., de la Peña and Giné, 1999) says that for centered, i.i.d.
random variables $\xi_i$, if $\|\xi_i\|_\infty \le c < \infty$ and $\sigma^2 = E\xi_i^2$, then

\[ \Pr\Big\{\sum_{i=1}^m \xi_i > t\Big\} \le \exp\left(-\frac{t^2}{2m\sigma^2 + 2ct/3}\right). \tag{3.5} \]
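
As an illustration of Bernstein's inequality (3.5), the sketch below compares the bound with a simulated tail frequency for sums of centered uniform variables on $[-1,1]$, for which $c = 1$ and $\sigma^2 = 1/3$. The distribution, sample sizes and function names are assumptions chosen for the example.

```python
import math
import random

def bernstein_bound(t, m, sigma2, c):
    """Right-hand side of (3.5): exp(-t^2 / (2 m sigma^2 + 2 c t / 3))."""
    return math.exp(-t * t / (2.0 * m * sigma2 + 2.0 * c * t / 3.0))

def empirical_tail(t, m, trials, rng):
    """Monte Carlo frequency of {xi_1 + ... + xi_m > t} for xi_i ~ Uniform(-1, 1)."""
    hits = 0
    for _ in range(trials):
        total = sum(rng.uniform(-1.0, 1.0) for _ in range(m))
        if total > t:
            hits += 1
    return hits / trials

rng = random.Random(0)
m, t = 100, 15.0
bound = bernstein_bound(t, m, sigma2=1.0 / 3.0, c=1.0)
freq = empirical_tail(t, m, trials=5000, rng=rng)
print(freq, bound)
```

The simulated frequency sits well below the bound, as expected: Bernstein's inequality is not sharp for a fixed distribution, but it has exactly the two-regime form (Gaussian for small $t$, exponential for large $t$) that the tail estimates of this section rely on.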



Applying Bernstein's inequality to the third term in (3.4), given Corollary 2.1 and inequality
(2.2), we obtain



Pr 



< 2 exp 



Y J {H n {X l ,X i ) - EH^Xi)) 

T 2 n 2 2 -3j n 



i=l 



> rn2 2 



Jn 



(3.6) 



3 m 2-2in||$||4 + ^™2-2^||$||2 



The first two terms in (3.4) are of U-statistics type. They can be controlled by the
following exponential inequality for canonical U-statistics.
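
Theorem 3.1. (Giné, Latała, Zinn, 2000) There exists a universal constant $L < \infty$ such
that, if $h_{ij}$ are bounded canonical kernels of two variables for the independent random
variables $(X_i^{(1)}, X_j^{(2)})$, $i,j = 1,2,\dots,n$, and if $A, B, C, D$ are as defined below, then

\[ \Pr\Big\{ \Big| \sum_{1 \le i,j \le n} h_{ij}(X_i^{(1)}, X_j^{(2)}) \Big| \ge x \Big\} \le L\exp\left[ -\frac{1}{L}\min\left( \frac{x^2}{C^2}, \frac{x}{D}, \frac{x^{2/3}}{B^{2/3}}, \frac{x^{1/2}}{A^{1/2}} \right) \right] \tag{3.7} \]

for all $x > 0$, where

\[ D = \|(h_{ij})\|_{L_2 \to L_2}, \tag{3.8} \]

\[ C^2 = \sum_{i,j} E h_{ij}^2(X_i^{(1)}, X_j^{(2)}), \tag{3.9} \]

\[ B^2 = \max\left\{ \Big\| \sum_i E h_{ij}^2(X_i^{(1)}, y) \Big\|_\infty, \Big\| \sum_j E h_{ij}^2(x, X_j^{(2)}) \Big\|_\infty \right\}, \tag{3.10} \]

and

\[ A = \max_{i,j}\|h_{ij}\|_\infty. \tag{3.11} \]

Theorem 3.1 also holds if the decoupled U-statistic $\sum_{1 \le i,j \le n} h_{ij}(X_i^{(1)}, X_j^{(2)})$ is replaced
by the undecoupled U-statistic $\sum_{1 \le i \ne j \le n} h_{ij}(X_i, X_j)$. We will take $h_{ij} = H_{n,F,i,j} = H_{n,F}$,
calculate the constants $A, B, C, D$ in Theorem 3.1, and apply it to $\sum_{1 \le i \ne j \le m} H_{n,F}(X_i, X_j)$.
(2.3) gives

\[ A \le 4 \cdot 2^{-j_n}\|\Phi\|_2^2, \qquad B^2 \le 16m \cdot 2^{-2j_n}\|\Phi\|_2^4. \tag{3.12} \]

By Lemmas 2.3 and 2.4, for $n$ large enough depending on $F$,

\[ C^2 \le 2m^2 \cdot 2^{-3j_n}\|\Phi\|_2^2\|\Phi\|_1^2 \int_F f^2(x)\,dx. \tag{3.13} \]

If $f$ satisfies condition (f) and $\phi$ satisfies condition (S1), the bound on $D$ can be calculated
by following the proof in the kernel case and making the obvious modifications there:

\[ D \le 4m2^{-2j_n}\|f\|_\infty\|\Phi\|_1^2. \tag{3.14} \]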

Proposition 3.2. Let $X_i$ be i.i.d. with density $f$ satisfying condition (f). Let $F$ be a
measurable subset of $\mathbb{R}$ satisfying condition (1.17). Assume that $\phi$ satisfies (S1), that $K$ is the
projection kernel associated with $\phi$, and that $2^{-j_n} \to 0$. Then there exist constants $\kappa_0$ (depending on $f$ and $\phi$)
and $n_0$ (depending on $F$, $f$, $\phi$ and the sequence $\{j_n\}$) such that, for all $\tau > 0$ and for all
$n \ge n_0$, $0 < m \le n$,

\[ \Pr\Big\{ \Big| \sum_{1 \le i \ne j \le m} H_{n,F}(X_i, X_j) \Big| > \tau n 2^{-3j_n/2} \Big\} \le \kappa_0 \exp\left[ -\frac{1}{\kappa_0}\min\left( \frac{\tau^2 n^2}{m^2\int_F f^2(x)\,dx}, \frac{\tau n}{m 2^{-j_n/2}}, \frac{\tau^{2/3} n^{2/3} 2^{-j_n/3}}{m^{1/3}}, \tau^{1/2} n^{1/2} 2^{-j_n/4} \right) \right] \tag{3.15} \]

and

\[ \Pr\Big\{ \Big| \sum_{i=1}^m \sum_{j=m+1}^n H_{n,F}(X_i,X_j) \Big| > \tau n 2^{-3j_n/2} \Big\} \le \kappa_0 \exp\left[ -\frac{1}{\kappa_0}\min\left( \frac{\tau^2 n^2}{m(n-m)\int_F f^2(x)\,dx}, \frac{\tau n}{\sqrt{m(n-m)}\,2^{-j_n/2}}, \frac{\tau^{2/3} n^{2/3} 2^{-j_n/3}}{(m \vee (n-m))^{1/3}}, \tau^{1/2} n^{1/2} 2^{-j_n/4} \right) \right]. \tag{3.16} \]

Proof. Gathering Theorem 3.1, (3.12), (3.13) and (3.14), we get (3.15). (3.16) can be
obtained in a similar way. □

Using this and (3.6) for the diagonal $L_n(F)$, we also have

Proposition 3.3. Under the same hypotheses as in Proposition 3.2 on $f$, $\phi$ and $\{j_n\}$, there
exist constants $\kappa_0$ (depending on $\phi$ and $f$) and $n_0$ (depending on $F$, $f$, $\phi$ and the sequence
$\{j_n\}$) such that, for all $\tau > 0$ and for all $n \ge n_0$,



Pr 



{\W n (F)\ 



> rn2 2 



.In 



1 



< Kq exp min 

Kq 



T 



f F f 2 (x)dx' 



2 W2 r 2/3 n l/3 2 -f l/2 n l/2 2 -f 2 nrJ 



(3.17) 



In particular, if the sequence $2^{-j_n}$ satisfies condition (B1) and $\tau = \eta\sqrt{\log\log n}$, the first
term dominates. For every $\eta > 0$ there exist $\kappa_0$ and $n_0$ as above such that

\[ \Pr\Big\{ |W_n(F)| > \eta n 2^{-3j_n/2}\sqrt{\log\log n} \Big\} \le \kappa_0 \exp\left( -\frac{\eta^2\log\log n}{\kappa_0\int_F f^2(x)\,dx} \right) \tag{3.18} \]

for all $n \ge n_0$.

Now the three terms in the decomposition of $W_n(\mathbb{R}) - W_{n,m}(\mathbb{R})$ in (3.4) can be bounded.
The first two are of the U-statistics type, so Proposition 3.2 is used to obtain the estimate.
The last one is a sum of mean zero i.i.d. r.v.'s and can be dealt with by (3.6).

Lemma 3.4. Under the same hypotheses as in Proposition 3.2 on $f$, $\phi$ and $\{j_n\}$, there exist
a constant $\kappa_0$ (depending on $f$ and $\phi$) and $\eta > 0$ such that, for all $\varepsilon > 0$, $\sigma > 0$, if $n$ is
large enough (depending on $f$, $\phi$ and $\{j_n\}$), and $m$ fixed is such that $0 < m \le n$,



Pr{|W„(R) - Wn,m(R)| > e<rn2 _3jn/ V21oglogn| < K exp 



e 2 n^ 



Kq 



(3.19) 

4. Moderate Deviations

In this section, we prove a moderate deviation result for $W_n([-M,M])$. This statistic
can be approximated by a Gaussian chaos due to the Komlós-Major-Tusnády (KMT)
theorem and the Dvoretzky-Kiefer-Wolfowitz (DKW) inequalities. Then a moderate
deviation result in (GM) is used for the Gaussian chaos. Throughout, $\phi$ satisfies both (S1) and (S2).

Let $F_n(t) := \frac{1}{n}\sum_{i=1}^n 1\{X_i \le t\}$ and let $B_n$ be a sequence of Brownian bridges. For all
$x \in \mathbb{R}$, set

\[ E_n(x) := \sqrt{n2^{-j_n}}\,\big[f_{n,K}(x) - Ef_{n,K}(x)\big] = \sqrt{\frac{2^{j_n}}{n}}\sum_{i=1}^n \big[K(2^{j_n}x, 2^{j_n}X_i) - EK(2^{j_n}x, 2^{j_n}X)\big] = \sqrt{n2^{j_n}}\int_{\mathbb{R}} K(2^{j_n}x, 2^{j_n}t)\,d[F_n(t) - F(t)]. \tag{4.1} \]

Let $K_{n,x}(t) := K(2^{j_n}x, 2^{j_n}t)$ and let $\mu_{K_{n,x}}$ be the Borel measure associated with $K_{n,x}(t)$.
Define the Gaussian process

\[ \Gamma_n(x) := 2^{j_n/2}\int_{\mathbb{R}} [B_n(F(x)) - B_n(F(t))]\,d\mu_{K_{n,x}}(t). \tag{4.2} \]

We want to approximate

\[ \frac{2^{3j_n/2}}{n} W_n([-M, M]) = 2^{j_n/2}\int_{-M}^{M} \big[(E_n(t))^2 - E(E_n(t))^2\big]\,dt \tag{4.3} \]

by a Gaussian chaos:

\[ 2^{j_n/2}\int_{-M}^{M} \big[(\Gamma_n(t))^2 - E(\Gamma_n(t))^2\big]\,dt. \tag{4.4} \]

In order to apply the KMT theorem, we need an integration by parts formula for $E_n(x)$.
This requires us to check two conditions: (i) $F_n(t) - F(t)$ and $K_{n,x}(t)$ are in the space
NBV, where NBV is defined by

\[ \mathrm{NBV} = \{G \text{ is of bounded variation, } G \text{ is right continuous and } G(-\infty) = 0\}. \tag{4.5} \]

(ii) Almost surely, for fixed $N$, there are no points in $[-N, N]$ where $F_n(t) - F(t)$ and
$K_{n,x}(t)$ are both discontinuous.

For any $m \in \mathbb{N}$, let $\{-\infty < t_0 < \cdots < t_m = t\}$ be a partition over $(-\infty,t]$. Then

\[ \sum_{i=1}^m |K_{n,x}(t_i) - K_{n,x}(t_{i-1})| \le \sum_k |\phi(2^{j_n}x - k)| \sum_{i=1}^m |\phi(2^{j_n}t_i - k) - \phi(2^{j_n}t_{i-1} - k)|. \tag{4.6} \]

Since $\phi$ satisfies (1.8) and (S2), we have, for almost every $x$,

\[ \|K_{n,x}\|_v \le \sum_k |\phi(2^{j_n}x - k)|\,\|\phi\|_v \le \|\theta_\phi\|_\infty\|\phi\|_v =: C_\phi, \tag{4.7} \]

where $C_\phi$ is a constant that depends only on the scaling function $\phi$. The other conditions
in (i) are obvious. To verify (ii), we note that $K_{n,x}(t)$ could only have discontinuities at
dyadic points whereas $F_n(t) - F(t)$ could only have discontinuities at $X_i$, $1 \le i \le n$.

Then we apply an integration by parts formula (Ex. 3.34, Folland 1999) to the integral
$\int_{[-N,N]} K_{n,x}(t)\,d[F_n(t) - F(t)]$ and let $N \to \infty$. By dominated convergence, this gives

\[ \int_{\mathbb{R}} K_{n,x}(t)\,d[F_n(t) - F(t)] + \int_{\mathbb{R}} (F_n(t) - F(t))\,d\mu_{K_{n,x}}(t) = 0. \tag{4.8} \]

Moreover, since $\int_{\mathbb{R}} d\mu_{K_{n,x}}(t) = 0$,

\[ E_n(x) = \sqrt{n2^{j_n}}\int_{\mathbb{R}} [F(t) - F_n(t) - (F(x) - F_n(x))]\,d\mu_{K_{n,x}}(t). \tag{4.9} \]

Now we are able to bound the difference between (4.3) and (4.4). We set $\alpha_n(t) :=
\sqrt{n}[F_n(t) - F(t)]$ and $D_n := \sup_{-\infty < t < \infty} |\alpha_n(t) - B_n(F(t))|$. We have

\[ D_n(M) := \left| \frac{2^{3j_n/2}}{n} W_n([-M, M]) - 2^{j_n/2}\int_{-M}^{M} \big((\Gamma_n(t))^2 - E(\Gamma_n(t))^2\big)\,dt \right| \le 2^{j_n} \cdot 4M D_n C_\phi \operatorname*{ess\,sup}_x \big(|E_n(x)| + |\Gamma_n(x)|\big) \le 2^{3j_n/2}\, 8M D_n \big(\|\alpha_n\|_\infty + \|B_n\|_\infty\big) C_\phi^2. \tag{4.10} \]

We use the KMT theorem for $D_n$ and the DKW inequalities for $\|\alpha_n\|_\infty$ and $\|B_n\|_\infty$.

Theorem 4.1. (Komlós, Major, Tusnády, 1975) There exists a probability space $(\Omega, \mathcal{A}, P)$
with i.i.d. random variables $X_1, X_2, \dots$ with density $f$ and a sequence of Brownian bridges
$B_1, B_2, \dots$ such that, for all $n \ge 1$ and $x \in \mathbb{R}$,

\[ \Pr\big\{D_n \ge n^{-1/2}(a\log n + x)\big\} \le b\exp(-cx), \tag{4.11} \]

where $a$, $b$ and $c$ are positive constants that do not depend on $n$, $x$ or $f$.

The DKW inequalities (Dvoretzky, Kiefer, Wolfowitz, 1956; or see Shorack and Wellner,
1986) give that, for every $z > 0$,

\[ \Pr\{\|\alpha_n\|_\infty > z\} \le 2\exp(-2z^2), \qquad \Pr\{\|B_n\|_\infty > z\} \le 2\exp(-2z^2). \tag{4.12} \]

We arrive at the following proposition.

Proposition 4.2. Assuming the scaling function $\phi$ satisfies (S1), (S2) and $j_n$ satisfies
(B1), for any $\gamma > 0$ there exists $C_{M,\phi} > 0$ such that

\[ \Pr\left\{ D_n(M) \ge \frac{C_{M,\phi}(\log n)^2}{2^{-3j_n/2}\sqrt{n}} \right\} \le n^{-\gamma} \tag{4.13} \]

for all $n \ge n_0(\gamma)$.
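
Proof. For $\gamma > 0$, take $x = 2\gamma\log n/c$ in (4.11). If $n$ is sufficiently large depending on $\gamma$,

\[ \Pr\left\{ D_n \ge \frac{1}{\sqrt{n}}\Big(a + \frac{2\gamma}{c}\Big)\log n \right\} \le b\exp(-2\gamma\log n) \le \frac{1}{2}n^{-\gamma}. \tag{4.14} \]

From the DKW inequalities (4.12), it is easy to see that for $n$ large enough,

\[ \Pr\left\{ \|\alpha_n\|_\infty + \|B_n\|_\infty > \frac{\log n}{a + 2\gamma/c} \right\} \le \frac{1}{2}n^{-\gamma}. \tag{4.15} \]

Combining these with (4.10), we get

\[ \Pr\left\{ D_n(M) \ge \frac{8MC_\phi^2(\log n)^2}{2^{-3j_n/2}\sqrt{n}} \right\} \le \Pr\left\{ D_n \ge \frac{1}{\sqrt{n}}\Big(a + \frac{2\gamma}{c}\Big)\log n \right\} + \Pr\left\{ \|\alpha_n\|_\infty + \|B_n\|_\infty > \frac{\log n}{a + 2\gamma/c} \right\} \le n^{-\gamma}. \tag{4.16} \]

Setting $C_{M,\phi} = 8MC_\phi^2$ yields (4.13). □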

It is easier to obtain a moderate deviation result for $2^{j_n/2}\int_{-M}^{M} ((\Gamma_n(t))^2 - E(\Gamma_n(t))^2)\,dt$
than for $2^{3j_n/2}W_n([-M, M])/n$. For the former we can adapt the method in (GM), where
a moderate deviation result for similar random variables is obtained by adapting a method
that Pinsky (1966) used to prove the LIL for sums of random variables with finite moments
higher than 2. It is a well-known fact that $\int_{-M}^{M} ((\Gamma_n(t))^2 - E(\Gamma_n(t))^2)\,dt$ can be written
as a sum of weighted, centered chi-squared random variables (e.g., Proposition 4.3, GM,
2004). Recall the operator $\mathcal{R}_{n,F}$ defined in (2.7). Let $\lambda_{n,1} \ge \lambda_{n,2} \ge \dots \ge 0$ be the
eigenvalues of the operator $\mathcal{R}_{n,F}$ with $F = [-M, M]$, and let $Z_k$ be i.i.d. $N(0, 1)$. We then have

\[ \int_F \big[(\Gamma_n(t))^2 - E(\Gamma_n(t))^2\big]\,dt = \sum_{k=1}^{\infty} \lambda_{n,k}(Z_k^2 - 1). \tag{4.17} \]

The limiting variance is calculated using Lemmas 2.4 and 2.5:

\[ \lim_{n\to\infty} 2^{j_n} E\left[ \int_{-M}^{M} \big((\Gamma_n(t))^2 - E(\Gamma_n(t))^2\big)\,dt \right]^2 = \lim_{n\to\infty} 2 \cdot 2^{j_n}\sum_{k=1}^{\infty} \lambda_{n,k}^2 = \lim_{n\to\infty} 2 \cdot 2^{j_n}\int_{-M}^{M}\int_{-M}^{M} R_n^2(s, t)\,ds\,dt = 2\int_{-M}^{M} f^2(x)\,dx =: \sigma^2(M). \tag{4.18} \]
2/„,\J„, . _2/ 



Set 6 n := ^A ni i/ JY^=i Ki,k) for some < ^ < 1 and 



V n (M) 



2-?w " 2 
a(M) J_ 



ill 



(r n (t)) 2 - E{v n {t)) 2 )dt 



(4.19) 
Using (4.17) and a modification of Pinsky's method, we have a moderate deviation result for
$V_n(M)$, which is parallel to (4.15), (GM). For any sequence $a_n$ converging to infinity at
the rate $a_n^2 + \log b_n \to -\infty$ and for all $0 < \varepsilon < 1$,

\[ \exp\left(-\frac{a_n^2(1+\varepsilon)}{2}\right) \le \Pr\{\pm V_n(M) \ge a_n\} \le \exp\left(-\frac{a_n^2(1-\varepsilon)}{2}\right) \tag{4.20} \]

if $n$ is large enough depending on $\varepsilon$.

We can use this result, the triangle inequality and Proposition 4.2 to obtain:

Proposition 4.3. Let $a_n = C\sqrt{2\log\log n}$, $0 < C < \infty$. Under the hypotheses of
Proposition 4.2 and further assuming that $f$ satisfies condition (f) and that $\int_{-M}^{M} f^2(x)\,dx > 0$,
we have a two-sided inequality,

-» (- aj ^n * * {±is^™> >- -} <- -> (-^) 4 

(4-21) 

for all $0 < \varepsilon < 1$ and $n$ large enough (depending on $M$ and $\varepsilon$).

5. Main Proofs 

5.1. Theorem 1.1

Proof. We show that $J_n = \bar J_n$, where $\bar J_n$ is defined in (1.13). We have

\[ J_n = \int_{\mathbb{R}} \big[ f_{n,K}^2 - Ef_{n,K}^2 - 2ff_{n,K} + 2fEf_{n,K} \big] \tag{5.1} \]

and

\[ \bar J_n = \int_{\mathbb{R}} \big[ f_{n,K}^2 - 2f_{n,K}Ef_{n,K} - Ef_{n,K}^2 + 2(Ef_{n,K})^2 \big]. \tag{5.2} \]

It remains to show that the difference

\[ J_n - \bar J_n = 2\int_{\mathbb{R}} (f - Ef_{n,K})(Ef_{n,K} - f_{n,K}) = 0. \tag{5.3} \]

$Ef_{n,K} - f_{n,K}$ is a linear combination of $\{\phi_{0k}\}$ and $\{\psi_{jk}\}$, $0 \le j \le j_n - 1$, whereas
$f - Ef_{n,K}$ is a linear combination of $\{\psi_{jk}\}$, $j \ge j_n$. By orthogonality of $\{\phi_{0k}, \psi_{jk}\}$, we
have $J_n - \bar J_n = 0$. Thus the proof of Theorem 1.1 reduces to proving that

\[ \limsup_{n\to\infty} \pm\frac{n2^{-j_n/2}\bar J_n}{\sigma\sqrt{2\log\log n}} = 1 \quad\text{a.s.} \tag{5.4} \]

By (1.14), this is equivalent to

\[ \limsup_{n\to\infty} \pm\frac{2^{3j_n/2}W_n(\mathbb{R})}{n\sigma\sqrt{2\log\log n}} = 1 \quad\text{a.s.} \tag{5.5} \]

Since we have variance computation, tail estimation and moderate deviation
results analogous to those for the kernel density estimator, the proof is the same as in Theorem 5.1,
(GM). We give an outline of the proof, but readers should refer to (GM) for details.

(i) Proof of the lower bound: Lemma 3.4 and Borel-Cantelli imply that the random
variable $\limsup_n \frac{W_n(\mathbb{R})}{\sigma n2^{-3j_n/2}\sqrt{2\log\log n}}$ is measurable with respect to the tail $\sigma$-algebra of
the $X_i$. We assume the lower bound is not true. In particular, we choose $r_k = k^k$; then there
exists $c < 1$ s.t.

\[ \limsup_k \frac{W_{r_k}(\mathbb{R})}{\sigma r_k 2^{-3j_{r_k}/2}\sqrt{2\log\log r_k}} = c \quad\text{a.s.} \tag{5.6} \]

The proof of Lemma 3.4 also applies to $W_{r_k}(\mathbb{R}) - W_{r_k,r_{k-1}}(\mathbb{R})$ since $r_k/r_{k-1} \ge k$. And we
have

\[ \frac{|W_{r_k}(\mathbb{R}) - W_{r_k,r_{k-1}}(\mathbb{R})|}{\sigma r_k 2^{-3j_{r_k}/2}\sqrt{2\log\log r_k}} \to 0 \quad\text{a.s.} \tag{5.7} \]

Thus

\[ \limsup_k \frac{W_{r_k,r_{k-1}}(\mathbb{R})}{\sigma r_k 2^{-3j_{r_k}/2}\sqrt{2\log\log r_k}} = c \quad\text{a.s.} \tag{5.8} \]

By Borel-Cantelli, there exists $c'$ satisfying $c < c' < 1$, s.t.

\[ \sum_k \Pr\left\{ \frac{W_{r_k,r_{k-1}}(\mathbb{R})}{\sigma r_k 2^{-3j_{r_k}/2}\sqrt{2\log\log r_k}} \ge c' \right\} < \infty. \tag{5.9} \]

Set $m_k := r_k - r_{k-1}$ and define

\[ W'_{m_k}(\mathbb{R}) := \int_{\mathbb{R}} \Big(\sum_{i=1}^{r_k - r_{k-1}} \bar K_{r_k}(t, X_i)\Big)^2 dt - E\int_{\mathbb{R}} \Big(\sum_{i=1}^{r_k - r_{k-1}} \bar K_{r_k}(t, X_i)\Big)^2 dt. \tag{5.10} \]

Since $W'_{m_k}(\mathbb{R})$ and $W_{r_k,r_{k-1}}(\mathbb{R})$ have the same distribution, (5.9) holds with $W_{r_k,r_{k-1}}(\mathbb{R})$
replaced by $W'_{m_k}(\mathbb{R})$. This and $m_k/r_k \to 1$ imply that there exists $c''$ satisfying $c' < c'' < 1$,
s.t.

\[ \sum_k \Pr\Big\{ W'_{m_k}(\mathbb{R}) \ge c''\sigma m_k 2^{-3j_{r_k}/2}\sqrt{2\log\log m_k} \Big\} < \infty. \tag{5.11} \]

We choose $M$ large enough so that $\int_{[-M,M]^c} f^2(x)\,dx \le (\delta c''\sigma)^2/\kappa_0$, where $\kappa_0$ is the
constant in (3.18). $W'_{m_k}(\mathbb{R})$ can be split into $W'_{m_k}([-M,M])$ and $W'_{m_k}([-M, M]^c)$. (3.18) is
used for $W'_{m_k}([-M, M]^c)$ and Proposition 4.3 for $W'_{m_k}([-M, M])$. Then we would reach
a contradiction to (5.11) and thus prove the lower bound.

(ii) Proof of the upper bound: We shall first use conditions (S1) and (B2) to introduce
a blocking and reduce $W_n(\mathbb{R})$ to $W_{n_k}(\mathbb{R})$ for the sequence $n_k := \min\{n \in \mathbb{N} : n \ge \lambda_k\}$. $n_k$
satisfies the same properties as $\lambda_k$ does. $I_k$ is the block defined by $I_k := [n_k, n_{k+1}) \cap \mathbb{N}$;
it is nonempty for $k \ge k_0$.

By Borel-Cantelli, it suffices to show that, for every $\delta > 0$,

\[ \sum_{k \ge k_0} \Pr\Big\{ \max_{n \in I_k} |W_n(\mathbb{R})| > (1 + \delta)\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty. \tag{5.12} \]

We will prove that for every $\tau > 0$

\[ \sum_{k \ge k_0} \Pr\Big\{ \max_{n \in I_k} |W_n(\mathbb{R}) - W_{n_k}(\mathbb{R})| > \tau\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty. \tag{5.13} \]

For $n \in I_k$, similar to (3.4), we have

\[ W_n(\mathbb{R}) - W_{n_k}(\mathbb{R}) = 2\sum_{i=1}^{n_k}\sum_{j=n_k+1}^{n} H_{n_k}(X_i,X_j) + \sum_{n_k < i \ne j \le n} H_{n_k}(X_i,X_j) + \sum_{i=n_k+1}^{n} \big(H_{n_k}(X_i,X_i) - EH_{n_k}(X_i,X_i)\big). \tag{5.14} \]

$H_n$ is replaced by $H_{n_k}$ since $\{2^{-j_n}\}$ is constant for $n \in I_k$ by hypothesis. We will apply the
Montgomery-Smith maximal inequality (Montgomery-Smith, 1993) to the first and the
last summands directly: If $X_i$ are i.i.d. r.v.'s taking values in a Banach space and $\|\cdot\|$ is
a norm in the Banach space, then

\[ \Pr\Big\{ \max_{1 \le k \le n} \Big\|\sum_{i=1}^k X_i\Big\| > t \Big\} \le 9\Pr\Big\{ \Big\|\sum_{i=1}^n X_i\Big\| > \frac{t}{30} \Big\}. \tag{5.15} \]

However, the second summand is not a sum of i.i.d. random variables. A decoupling
inequality (e.g., de la Peña and Giné, 1999, Theorem 3.4.1) is used to transform it into a sum over
independent variables, i.e., $\sum_{n_k < i \ne j \le n} H_{n_k}(X_i^{(1)}, X_j^{(2)})$, where $X_i^{(1)}$ and $X_j^{(2)}$, $i,j \in \mathbb{N}$, are
i.i.d. copies of $X_1$. Then we add the diagonal, apply the Montgomery-Smith inequality twice
and subtract the diagonal at last. We will be able to reduce (5.13) to proving that, for
every $\tau > 0$,

\[ \sum_{k \ge k_0} \Pr\Big\{ \Big| \sum_{j=n_k+1}^{n_{k+1}-1}\sum_{i=1}^{n_k} H_{n_k}(X_i, X_j) \Big| > \tau\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty, \tag{5.16} \]

\[ \sum_{k \ge k_0} \Pr\Big\{ \Big| \sum_{i=n_k+1}^{n_{k+1}-1} \big(H_{n_k}(X_i,X_i) - EH_{n_k}(X_i,X_i)\big) \Big| > \tau\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty, \tag{5.17} \]

\[ \sum_{k \ge k_0} \Pr\Big\{ \Big| \sum_{j=n_k+1}^{n_{k+1}-1}\sum_{i=n_k+1}^{n_{k+1}-1} H_{n_k}(X_i^{(1)}, X_j^{(2)}) \Big| > \tau\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty, \tag{5.18} \]

and

\[ \sum_{k \ge k_0} \Pr\Big\{ \Big| \sum_{i=n_k+1}^{n_{k+1}-1} H_{n_k}(X_i^{(1)}, X_i^{(2)}) \Big| > \tau\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty. \tag{5.19} \]

(5.16) and (5.17) come from the first and last summands in (5.14), whereas (5.18) and (5.19)
come from the second summand. We apply Bernstein's inequality to (5.17) and (5.19).
Proposition 3.2 will take care of (5.16) and (5.18). Therefore, (5.13) is proved. Thus
(5.12) is reduced to showing that for every $\delta > 0$,

\[ \sum_{k \ge k_0} \Pr\Big\{ |W_{n_k}(\mathbb{R})| > (1 + \delta)\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} < \infty. \tag{5.20} \]

The second step is to reduce $W_{n_k}(\mathbb{R})$ to $W_{n_k}([-M,M])$ for some $M$ large enough.
Given $\delta > 0$, there exists $M < \infty$ such that $\int_{[-M,M]^c} f^2(x)\,dx \le \delta^2\sigma^2/(4\kappa_0)$, where $\kappa_0$ is
the constant in inequality (3.18). Application of (3.18) gives that, from some $k$ on,

\[ \Pr\Big\{ |W_{n_k}([-M,M]^c)| > \frac{\delta}{2}\sigma n_k 2^{-3j_{n_k}/2}\sqrt{2\log\log n_k} \Big\} \le \kappa_0\exp(-2\log\log n_k), \tag{5.21} \]

where the right hand side is the general term of a convergent series. Let $\varepsilon$ be so small
that $(1 + \delta/2)^2(1 - \varepsilon) > 1$. Now we use (4.21) to obtain that, for $k$ large enough,



Pr[\W nk ([-M,M])\ > (1 + -)™ fc 2-WV21oglogn fc 
<Pr||W nfe ([-M,M])| > (1 + 5 -)a(M)n k 2-^' 2 ^2\og\ogn k 
< exp (-(1 + 6/2) 2 (l - e) loglogn fc ) + -j , 



(5.22) 



which is also the general term of a convergent series. Hence the series (5.20) converges
for every $\delta > 0$.

□ 



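5.2. Theorem 1.2

Proof. Without loss of generality, we will assume that there exist constants $C_1$
and $C_2$ such that, for all $n$, $C_1n^{-\delta} \le 2^{-j_n} \le C_2n^{-\delta}$. Proving Theorem 1.2 is equivalent to proving that

\[ \sup_t \big| \Pr\{n2^{-j_n/2}\bar J_n \le t\} - \Pr\{\sigma Z \le t\} \big| \le C\big(n^{-3\delta/16} \vee n^{-\delta\alpha}\sqrt{\log n}\big). \tag{5.23} \]

By (1.14) and (1.15), we have that

\[ n2^{-j_n/2}\bar J_n/\sigma = \frac{2^{3j_n/2}}{n\sigma}W_n(\mathbb{R}) = \frac{2^{3j_n/2}}{n\sigma}U_n(\mathbb{R}) + \frac{2^{3j_n/2}}{n\sigma}L_n(\mathbb{R}). \tag{5.24} \]

Using the triangle inequality, we can obtain an upper bound and a lower bound for this
statistic. For an arbitrary positive sequence $\varepsilon_{1,n}$,

\[ \sup_t \big| \Pr\{n2^{-j_n/2}\bar J_n/\sigma \le t\} - \Pr\{Z \le t\} \big| \le \sup_t \left| \Pr\Big\{ \frac{2^{3j_n/2}}{n\sigma}U_n(\mathbb{R}) \le t \Big\} - \Pr\{Z \le t\} \right| + \Pr\Big\{ \Big| \frac{2^{3j_n/2}}{n\sigma}L_n(\mathbb{R}) \Big| > \varepsilon_{1,n} \Big\} + \sup_t \Pr\{t - \varepsilon_{1,n} \le Z \le t + \varepsilon_{1,n}\}. \tag{5.25} \]

It is easy to bound the last term:

\[ \sup_t \Pr\{t - \varepsilon_{1,n} \le Z \le t + \varepsilon_{1,n}\} \le \varepsilon_{1,n}. \tag{5.26} \]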
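
The bound (5.26) follows because the standard normal density is at most $1/\sqrt{2\pi}$, so an interval of length $2\varepsilon$ has probability at most $2\varepsilon/\sqrt{2\pi} \le \varepsilon$. The sketch below checks this numerically on a grid of $t$ values; the grid and the function names are assumptions of the example.

```python
import math

def normal_interval_prob(t, eps):
    """Pr{t - eps <= Z <= t + eps} for Z ~ N(0, 1), via the error function."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return cdf(t + eps) - cdf(t - eps)

def sup_over_t(eps, grid=2001, span=5.0):
    """Approximate the supremum over t on a symmetric grid that contains t = 0."""
    ts = [-span + 2.0 * span * i / (grid - 1) for i in range(grid)]
    return max(normal_interval_prob(t, eps) for t in ts)

for eps in (0.01, 0.1, 0.5, 1.0):
    print(eps, sup_over_t(eps), 2.0 * eps / math.sqrt(2.0 * math.pi))
```

The supremum is attained at $t = 0$ by symmetry and unimodality of the normal density, which is why the grid is chosen to contain that point.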
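
By (3.6), for $0 < \varepsilon_{1,n} < 1$ so that $\varepsilon_{1,n}^2 < \varepsilon_{1,n}$,

\[ \Pr\big\{ |L_n(\mathbb{R})| > \sigma\varepsilon_{1,n}n2^{-3j_n/2} \big\} \le C\exp\left( -\frac{1}{C}\min\big( \sigma^2\varepsilon_{1,n}^2 n2^{-j_n}, \sigma\varepsilon_{1,n}n2^{-j_n/2} \big) \right) \le C\exp\left( -\frac{\varepsilon_{1,n}^2 n^{1-\delta}}{C} \right), \tag{5.27} \]

where $C$ depends on both $\phi$ and $f$, and $\delta \in (0, 1/3)$. We may take $\varepsilon_{1,n} = n^{-1/3}$ to obtain

\[ \Pr\big\{ |L_n(\mathbb{R})| > \sigma\varepsilon_{1,n}n2^{-3j_n/2} \big\} \le C\exp(-\log n) = Cn^{-1} \tag{5.28} \]

when $n$ is large enough. Using (5.26), we get

\[ \sup_t \Pr\{t - \varepsilon_{1,n} \le Z \le t + \varepsilon_{1,n}\} \le n^{-1/3}. \tag{5.29} \]

To control the first term in (5.25), we will approximate $2^{3j_n/2}U_n(\mathbb{R})/(n\sigma)$ by $S_{nn}$, which
is defined below. We set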

n i—1 



P ) 



(5.30) 



and 



then 



i=2 j=l 



v v-> H n (Xj,Xj) \ ^ Y 



i=2 



n i—1 



Snn — E E 

i=2 j=l 



H n (Xi, Xj 



(5.31) 



(5.32) 



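Analogous to (5.25), for any positive sequence $\varepsilon_{2,n}$,

\[ \sup_t \left| \Pr\Big\{ \frac{2^{3j_n/2}}{n\sigma}U_n(\mathbb{R}) \le t \Big\} - \Pr\{Z \le t\} \right| \le \sup_t \big| \Pr\{S_{nn} \le t\} - \Pr\{Z \le t\} \big| + \Pr\Big\{ \Big| \frac{2^{3j_n/2}}{n\sigma}U_n(\mathbb{R}) - S_{nn} \Big| > \varepsilon_{2,n} \Big\} + \sup_t \Pr\{t - \varepsilon_{2,n} \le Z \le t + \varepsilon_{2,n}\}. \tag{5.33} \]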



By (E3D, 



Pr 



where eL 



23in/2 



Un 



no 

2 3 in/ 2 i 



> e 2„ 



Pr 



l<ij^j<n 



> ^ , (5-34) 



no 2s r 



. We then estimate the order of $d_n$. Recall the definition of
$R_n(s,t)$ in (2.6) and set $e_n := \left( 2^{j_n} \int_{\mathbb{R}^2} R_n^2(s,t)\,ds\,dt \right)^{1/2}$. Using the definition of $s_n^2$ and
Fubini's theorem, we get

n i-1 , _ 1 \ 

4 = E E E ^( X - *i) = ^V^S-^e 2 . (5.35) 

i=2 3=1 






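Plugging it into $d_n$ and using the triangle inequality, we then have

$$d_n \le C2^{3j_n/2} \left| \frac{1}{n\sqrt{\int f^2(x)\,dx}} - \frac{1}{\sqrt{n(n-1)\int f^2(x)\,dx}} \right| + C2^{3j_n/2} \left| \frac{1}{\sqrt{n(n-1)\int f^2(x)\,dx}} - \frac{1}{\sqrt{n(n-1)}\,e_n} \right|. \tag{5.36}$$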



Since $2^{j_n} \le Cn^{\delta}$ for some $\delta \in (0, 1/3)$ and $1/\sqrt{n(n-1)} - 1/n \le n^{-2}$ when $n \ge 2$,
the first term is bounded by $Cn^{3\delta/2 - 2}$. Corollary 2.7 gives that $\left| e_n^2 - \int_{\mathbb{R}} f^2(x)\,dx \right| \le C(n^{-\delta/2} + n^{-\delta\alpha})$. The second term is bounded by $Cn^{3\delta/2 - 1}(n^{-\delta/2} + n^{-\delta\alpha})$ when $n$ is large
enough. Combining the two terms, $d_n \le Cn^{3\delta/2 - 1}(n^{-\delta/2} + n^{-\delta\alpha})$, where $C$ depends on
$f$, $\{j_n\}$ and $\phi$. Taking $\varepsilon_{2,n} = n^{-\delta(1/2 \wedge \alpha)}\sqrt{\log n}$ and using (5.34) and Proposition 3.2, we obtain



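$$\Pr\left\{ \left| \frac{2^{3j_n/2}}{n\sigma} U_n(\mathbb{R}) - S_{nn} \right| > \varepsilon_{2,n} \right\} \le C\exp(-\log n) = Cn^{-1} \tag{5.37}$$

when $n$ is large enough. Consequently,

$$\sup_t \Pr\{t - \varepsilon_{2,n} \le Z \le t + \varepsilon_{2,n}\} \le n^{-\delta(1/2 \wedge \alpha)}\sqrt{\log n}. \tag{5.38}$$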
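The elementary inequality $1/\sqrt{n(n-1)} - 1/n \le n^{-2}$ for $n \ge 2$, used above to bound the first term of $d_n$, can be confirmed with a few lines of code. This is only an illustrative check; the function names are hypothetical, and the exponent helper assumes $2^{j_n}$ is of exact order $n^{\delta}$.

```python
import math

def gap(n):
    """The difference 1/sqrt(n(n-1)) - 1/n appearing in the first term of d_n."""
    return 1.0 / math.sqrt(n * (n - 1)) - 1.0 / n

def first_term_exponent(delta):
    """Exponent of n in 2^{3 j_n / 2} * n^{-2}, assuming 2^{j_n} is of order n^delta."""
    return 1.5 * delta - 2.0

if __name__ == "__main__":
    # The gap is positive and never exceeds n^{-2} on the tested range.
    print(all(0.0 < gap(n) <= n ** -2 for n in range(2, 10000)))
    print(first_term_exponent(0.2))
```

Algebraically, the difference equals $1/\big(\sqrt{n(n-1)}\,(n + \sqrt{n(n-1)})\big)$, which behaves like $1/(2n^2)$ for large $n$.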



We then deal with $\sup_t |\Pr\{S_{nn} \le t\} - \Pr\{Z \le t\}|$. Let $\mathcal{F}_i$ be the $\sigma$-field generated by
$\{X_1, X_2, \dots, X_i\}$ for $i = 1, 2, \dots$. We first observe that, by the definitions in (5.30)-(5.32),

$$E(X_{ni} \mid \mathcal{F}_{i-1}) = 0, \tag{5.39}$$

and thus $S_{nk}$ is a martingale with respect to $\mathcal{F}_k$. We will use the result of Erickson,
Quine and Weber (1979) to derive a bound for $\sup_t |\Pr\{S_{nn} \le t\} - \Pr\{Z \le t\}|$. For
$i \ge 2$, let $X'_{ni} := X_{ni} - \mu_{ni}$, $\sigma_{ni}^2 := E[X_{ni}'^2 \mid \mathcal{F}_{i-1}]$ and $\sigma_n^2 := \sum_{i=2}^n \sigma_{ni}^2$. Also define
$Y_{ni} := \sum_{j=1}^{i-1} H_n(X_i, X_j)$ and $V_n^2 := \sum_{i=2}^n E(Y_{ni}^2 \mid \mathcal{F}_{i-1})$.

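Theorem 5.1 (Erickson, Quine, Weber, 1979). Given $X = \{X_{ni}, i = 2, \dots, n;\ n = 1, 2, \dots\}$
and $\mathcal{F} = \{\mathcal{F}_i, i = 1, 2, \dots\}$, let $S_{nn} := \sum_{i=2}^n X_{ni}$. If $\mu_{ni} = 0$ for all $n, i$, then for $\eta \in (0, 1]$,
there exists a constant $C$ such that

$$\sup_t |\Pr\{S_{nn} \le t\} - \Pr\{Z \le t\}| \le C\left( \sum_{i=2}^n E|X_{ni}|^{2+\eta} + E|1 - \sigma_n^2|^{1+\eta/2} \right)^{1/(3+\eta)}. \tag{5.40}$$

Consider the second term:

$$E|1 - \sigma_n^2|^2 = s_n^{-4} E|s_n^2 - V_n^2|^2 \le s_n^{-4} E(V_n^4). \tag{5.41}$$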



Set $G_n(x,y) = E(H_n(X_1,x)H_n(X_1,y))$; then, by the proof of Theorem 1 of Hall (1984),

$$E(V_n^4) \le C\left( n^4 EG_n^2(X_1, X_2) + n^3 EG_n^2(X_1, X_1) \right) \le C\left( n^4 EG_n^2(X_1, X_2) + n^3 EH_n^4(X_1, X_2) \right). \tag{5.42}$$

By (5.35) and Corollary 2.7, $s_n^4 \asymp n^{4-6\delta}$. The calculations in Theorem 1 of Zhang and Zheng
(1999) can be applied here directly; $H_n(x,y)$ defined in (3.2) is off by a scaling constant
$2^{-2j_n}n^2$ from their definition. Thus

$$EH_n^4(X_1, X_2) = (2^{-2j_n}n^2)^4\, O(2^{3j_n}/n^8) = O(2^{-5j_n}) = O(n^{-5\delta}), \tag{5.43}$$

and

$$EG_n^2(X_1, X_2) = (2^{-2j_n}n^2)^4\, O(2^{j_n}/n^8) = O(2^{-7j_n}) = O(n^{-7\delta}). \tag{5.44}$$

Combining these estimates and using Hölder's inequality, we see

$$E|1 - \sigma_n^2|^{1+\eta/2} \le Cn^{-\delta(2+\eta)/4}. \tag{5.45}$$
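The exponent bookkeeping behind (5.41)-(5.45) can be traced in a short sketch. It assumes the orders stated in the text, namely $n^4 EG_n^2 = O(n^{4-7\delta})$, $n^3 EH_n^4 = O(n^{3-5\delta})$ and $s_n^4$ of order $n^{4-6\delta}$; the function names are illustrative only.

```python
def variance_term_exponent(delta):
    """Exponent of n in the bound on E|1 - sigma_n^2|^2 implied by (5.41)-(5.44)."""
    e_g = 4.0 - 7.0 * delta   # n^4 * E G_n^2(X_1, X_2) is O(n^{4 - 7 delta})
    e_h = 3.0 - 5.0 * delta   # n^3 * E H_n^4(X_1, X_2) is O(n^{3 - 5 delta})
    e_s = 4.0 - 6.0 * delta   # s_n^4 is of order n^{4 - 6 delta}
    return max(e_g, e_h) - e_s

def holder_exponent(delta, eta):
    """Exponent after Hoelder: E|.|^{1+eta/2} <= (E|.|^2)^{(1+eta/2)/2}."""
    return variance_term_exponent(delta) * (1.0 + eta / 2.0) / 2.0

if __name__ == "__main__":
    print(variance_term_exponent(0.2))   # about -delta
    print(holder_exponent(0.2, 1.0))     # about -3 * delta / 4
```

For $\delta < 1/2$ the term $n^{4-7\delta}$ dominates, so the second moment is $O(n^{-\delta})$ and Hölder's inequality gives the exponent $-\delta(2+\eta)/4$.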



For the first term in (5.40), we observe that

$$\sum_{i=2}^n E|X_{ni}|^{2+\eta} \le \sum_{i=2}^n \frac{1}{s_n^{2+\eta}} \left( E\left| \sum_{j=1}^{i-1} H_n(X_i, X_j) \right|^3 \right)^{(2+\eta)/3}. \tag{5.46}$$



Let $E_i$ denote the expectation with respect to $X_i$ and $E_{i'}$ denote the expectation with
respect to $X_1, \dots, X_{i-1}$. We can apply a Hoffmann-Jørgensen type inequality with respect
to $E_{i'}$ (Theorem 1.5.13, de la Peña and Giné, 1999),

$$E\left| \sum_{j=1}^{i-1} H_n(X_i, X_j) \right|^3 \le C E_i \left\{ E_{i'} \max_{1 \le j \le i-1} |H_n(X_i, X_j)|^3 + \left( E_{i'} \left| \sum_{j=1}^{i-1} H_n(X_i, X_j) \right|^2 \right)^{3/2} \right\}. \tag{5.47}$$



The first term can be bounded using (2.3). For the second one, we use Jensen's inequality,
Hölder's inequality and (5.43) to get



'i-l 



3/2 



E, ( Ej/ | ^2 H n (Xi, Xj) 



E, ((i - l^HUX^X,))" 2 < C{i - lfl 2 n-^ 



These inequalities and ££= 2 2 (2+,?)/2 < Cn 2 ^/ 2 lead to 



(5.48 



^E|X ra | 2+ " < Cn ^/2-lK2 +v ) n -5(2 +v) ^ max (l ) ,3/2 n -3V4 ) (2+,)/3 < Cn S/2 +V 6/A- V /2 



i=2 



i=2 



(5.49) 

Gathering (5.40), (5.45) and (5.49) and noting that the bound is minimized when $\eta = 1$,
we arrive at

$$\sup_t |\Pr\{S_{nn} \le t\} - \Pr\{Z \le t\}| \le C\max\left( n^{3\delta/16 - 1/8},\ n^{-3\delta/16} \right) \le Cn^{-3\delta/16}. \tag{5.50}$$
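The choice $\eta = 1$ can be illustrated numerically. The sketch below takes the two terms inside (5.40) to be of order $n^{\delta/2 + \eta\delta/4 - \eta/2}$ and $n^{-\delta(2+\eta)/4}$ respectively (an assumption read off from the bounds above), raises them to the power $1/(3+\eta)$, and searches a grid of $\eta \in (0,1]$ for the smallest resulting exponent. The function names and grid size are hypothetical.

```python
def bound_exponents(delta, eta):
    """Exponents of n for the two terms of (5.40) after taking the power 1/(3 + eta)."""
    moment = (delta / 2 + eta * delta / 4 - eta / 2) / (3 + eta)   # moment-sum term
    variance = -delta * (2 + eta) / (4 * (3 + eta))                # conditional-variance term
    return moment, variance

def best_eta(delta, steps=100):
    """Grid search over eta in (0, 1] for the smallest dominating exponent."""
    grid = [k / steps for k in range(1, steps + 1)]
    return min(grid, key=lambda eta: max(bound_exponents(delta, eta)))

if __name__ == "__main__":
    delta = 0.2
    eta = best_eta(delta)
    print(eta, bound_exponents(delta, eta))
```

Both exponents are decreasing in $\eta$, so the largest admissible value $\eta = 1$ is optimal; there the exponents are $3\delta/16 - 1/8$ and $-3\delta/16$, and the second dominates whenever $\delta \le 1/3$.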



Putting together the last inequality with (5.25), (5.28), (5.29), (5.33), (5.37) and (5.38), we
conclude that when $n$ is large enough (depending on $f$ and $\phi$), there exists a constant $C$
(depending on $f$, $\phi$ and $\{j_n\}$) such that

$$\sup_t \left| \Pr\{n2^{-j_n/2} J_n/\sigma \le t\} - \Pr\{Z \le t\} \right| \le C\left( n^{-\delta(1/2 \wedge \alpha)}\sqrt{\log n} + n^{-3\delta/16} \right) \le C\left( n^{-3\delta/16} \vee n^{-\alpha\delta}\sqrt{\log n} \right). \tag{5.51}$$

Taking $C$ sufficiently large, (1.12) is true for all $n$.

□

Appendix 

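Proof of Lemma 2.5. By the definition of $C_n(s,t)$,

$$\begin{aligned}
2^{j_n} \int_{[-M,M]^2} C_n^2(s,t)\,ds\,dt = 2^{3j_n} \int_{[-M,M]^2} \Big\{ \int\!\!\int_{\mathbb{R}^2} & K(2^{j_n}t, 2^{j_n}x) K(2^{j_n}s, 2^{j_n}x) \\
& \times K(2^{j_n}t, 2^{j_n}y) K(2^{j_n}s, 2^{j_n}y) f(x) f(y)\,dx\,dy \Big\}\,ds\,dt.
\end{aligned} \tag{A.1}$$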

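By the change of variables $y = x - 2^{-j_n}u$, $t = 2^{-j_n}w + x$, $s = 2^{-j_n}z + x$ and the compact
support of $\Phi$, this integral is equal to

$$\begin{aligned}
& \int_{-A}^{A}\int_{-A}^{A}\int_{-2A}^{2A}\int_{\mathbb{R}} K(2^{j_n}x + z, 2^{j_n}x) K(2^{j_n}x + w, 2^{j_n}x) K(2^{j_n}x + z, 2^{j_n}x - u) K(2^{j_n}x + w, 2^{j_n}x - u) \\
& \qquad \times f(x) f(x - 2^{-j_n}u)\, \mathbf{1}(2^{-j_n}z + x \in [-M,M])\, \mathbf{1}(2^{-j_n}w + x \in [-M,M])\,dx\,du\,dz\,dw \\
& = \int_{-A}^{A}\int_{-A}^{A}\int_{-2A}^{2A} \sum_{i=-\infty}^{\infty} \int_0^{2^{-j_n}} K(2^{j_n}x + z + i, 2^{j_n}x + i) K(2^{j_n}x + w + i, 2^{j_n}x + i) \\
& \qquad \times K(2^{j_n}x + z + i, 2^{j_n}x - u + i) K(2^{j_n}x + w + i, 2^{j_n}x - u + i) f(x + 2^{-j_n}i) f(x + 2^{-j_n}i - 2^{-j_n}u) \\
& \qquad \times \mathbf{1}(2^{-j_n}z + x + 2^{-j_n}i \in [-M,M])\, \mathbf{1}(2^{-j_n}w + x + 2^{-j_n}i \in [-M,M])\,dx\,du\,dz\,dw.
\end{aligned} \tag{A.2}$$

Using $K(x+1, y+1) = K(x,y)$ and a change of variables, it is in turn equal to

$$\begin{aligned}
& \int_{-A}^{A}\int_{-A}^{A}\int_{-2A}^{2A} \sum_{i=-\infty}^{\infty} \int_0^1 2^{-j_n} K(x+z, x) K(x+w, x) K(x+z, x-u) K(x+w, x-u) \\
& \qquad \times f(2^{-j_n}(x+i)) f(2^{-j_n}(x+i-u))\, \mathbf{1}(2^{-j_n}(z+x+i) \in [-M,M]) \\
& \qquad \times \mathbf{1}(2^{-j_n}(w+x+i) \in [-M,M])\,dx\,du\,dz\,dw.
\end{aligned} \tag{A.3}$$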
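To continue, it is convenient to write

$$\begin{aligned}
& \sum_{i=-\infty}^{\infty} 2^{-j_n} f(2^{-j_n}(x+i)) f(2^{-j_n}(x+i-u))\, \mathbf{1}(2^{-j_n}(z+x+i) \in [-M,M])\, \mathbf{1}(2^{-j_n}(w+x+i) \in [-M,M]) \\
& = \Big[ \sum_{i=2A}^{\infty} + \sum_{i=-2A}^{2A-1} + \sum_{i=-\infty}^{-2A-1} \Big] 2^{-j_n} f(2^{-j_n}(x+i)) f(2^{-j_n}(x+i-u)) \\
& \qquad \times \mathbf{1}(2^{-j_n}(z+x+i) \in [-M,M])\, \mathbf{1}(2^{-j_n}(w+x+i) \in [-M,M]) \\
& =: I_1(j_n) + I_2(j_n) + I_3(j_n) = I(j_n).
\end{aligned} \tag{A.4}$$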
The next lemma proves the convergence of $I(j_n)$.

Lemma A.1. Assume that $f$ is bounded. For fixed $M > 0$,

$$I(j_n) \to \int_{-M}^{M} f^2(y)\,dy \tag{A.5}$$

uniformly for $x \in [0,1]$, $u \in [-2A, 2A]$, $z \in [-A, A]$, $w \in [-A, A]$ as $n \to \infty$.

Proof. To simplify the notation, let $u' = x - u$, $z' = x + z$, $w' = x + w$. Then $u' \in
[-2A, 2A+1]$, $z' \in [-A, A+1]$, $w' \in [-A, A+1]$. Consider $I_1(j_n)$. The general summand
of $I_1(j_n)$ is zero if $2^{-j_n}(-A + i) > M$.

(\V™M\-2A-\ [2^M\+a\ 
E + E 2-^/(2- j "(x + i))/(2-^(^ + i)) 
i=2A \2^M\-2AJ 

l(2- jn (z' + i) G [0,M])l(2^"(u/ + i) G [0,M]) 

where |_2 : '"MJ is the largest integer less than or equal to 23™ M. 

h(jn) is a finite sum with each summand bounded by a constant times 2~ 3n . So 
h(jn) -> uniformly for x G [0,1], u G [-2A,2A],z G [-A,A],u> G 

Setting Ay = 2~ jn (AA + 1), we can simplify i^jVi) since the indicator function in the 
general summand of hijn) must be 1. 

\p™M\-2A-\ 

h(jn) = E 2-i"f(2-i"(x + i))f(2-i«(u' + i)) 

i=2A (A.6) 

6A Ni v 1 

= AA + l ^ E A 2//( 2_J "(^ + + J A J /)/(2-^(«' + i) + j Ay), 

i=2A j=0 

where iVj is the largest j such that for fixed i, i + j{AA + 1) < [2 jn Mj - 2 A - 1. AT, = 
|_M/Ay — lj or \M / Ay — 2J depending on i. 

For each 2A < % < 6 A, consider the partition of [0, M\: 

P itn = {0, 2~ j "(i - 2A), 2~ j "(i - 2A) + Ay, 2~ jn (i - 2A) + (N t + l)Ay, M}. 

There are at most + 3 subintervals. Except for the first and the last subintervals, whose 
lengths we denote respectively by Ay^i and Ay^^+3, all the subintervals in this partition 
have length Ay = 2~ jn (4A + 1). We also have < Ay^ < Ay and < Ay i)Ni+z < Ay. 
Setting 

Ni 

S itn := / 2 (0) Ay itl + Ay/(2-^(x + i) + j Ay)f{2~^{u' + i) + jAy) + / 2 (M) Ay^ +3 , 

3=0 

(A.7) 
we see that

S t , n < / 2 (0)A^ 1 + ^M^.Ay + / 2 (M)A^ iV!+ 3 (A.8) 

j=0 

and 

Si, n > f 2 (0)Ay hl + mlAv + f{M)Ay i>Ni+3 , (A.9) 

j=0 

where Mjj and m^j denote respectively the supremum and the infimum of / on the 
partition [2~ j "(i - 2A) + jAy, 2~ jn (i - 2A) + (j + l)Ay\. As n -> oo, the mesh of P i>n 
tends to zero. Obviously, / 2 (0)A?/j jl + f 2 (M)Ay itN . +3 — > 0. / G Li and boundness 
of / implies that f 2 is Riemann integrable on [0, M\ for any M > 0. It follows that 
->■ io M / 2 for 2 A < z < 6 A and by (EOI) . 

AU) -> / f 2 (y) d y- (a.io) 

Jo 

Note that this convergence is uniform for x £ [0, 1] and u' £ [—2A,2A + 1], therefore, it 
is uniform for x £ [0,1], « £ [— 2A, 2A] , z £ [— A, A],w £ [—A, A]. We have thus proved 
that lim^oo h(j n ) = j^ 1 f 2 (y)dy uniformly for x,u,z,w in the corresponding intervals. 
By analogy, h(j n ) J° M f 2 (y)dy uniformly for x, u, z, w in the same intervals. 

Since / is bounded, h(jn) — > as n — > oo. 1 ) A. 51) is proved when collecting the results 
for h(jn), h(jn) and h(j n )- □ 

Lemma A.2. Assume the scaling function $\phi$ satisfies (S1) such that the kernel $K$ associated
with $\phi$ is dominated by $\Phi$ whose support is contained in $[-A, A]$, where $A$ is an
integer. Then

$$\int_{-A}^{A}\int_{-A}^{A}\int_{-2A}^{2A}\int_0^1 K(x+z, x) K(x+w, x) K(x+z, x-u) K(x+w, x-u)\,dx\,du\,dz\,dw = 1. \tag{A.11}$$

Proof. Since $K(x+z, x) K(x+w, x) K(x+z, x-u) K(x+w, x-u)$ is absolutely integrable,
by Fubini's theorem,

$$\begin{aligned}
& \int_{-A}^{A}\int_{-A}^{A}\int_{-2A}^{2A}\int_0^1 K(x+z, x) K(x+w, x) K(x+z, x-u) K(x+w, x-u)\,dx\,du\,dz\,dw \\
& = \int_0^1 \int_{\mathbb{R}} \int_{\mathbb{R}} K(x+z, x) K(x+z, x-u)\,dz \int_{\mathbb{R}} K(x+w, x) K(x+w, x-u)\,dw\,du\,dx.
\end{aligned} \tag{A.12}$$
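We make the following observation: for any $y$ and $z$, by orthogonality of $\phi$,

$$\int_{\mathbb{R}} K(x,y) K(x,z)\,dx = \sum_{k,l} \phi(y-k)\phi(z-l) \int_{\mathbb{R}} \phi(x-k)\phi(x-l)\,dx = \sum_{k} \phi(y-k)\phi(z-k) = K(y,z). \tag{A.13}$$

For fixed $x \in [0,1]$, by repeated applications of the above equation,

$$\int_{\mathbb{R}} \int_{\mathbb{R}} K(x+z, x) K(x+z, x-u)\,dz \int_{\mathbb{R}} K(x+w, x) K(x+w, x-u)\,dw\,du = K(x,x) = \sum_{k \in \mathbb{Z}} \phi^2(x-k). \tag{A.14}$$

Finally we consider

$$\int_0^1 \sum_{k \in \mathbb{Z}} \phi^2(x-k)\,dx = \sum_{k \in \mathbb{Z}} \int_0^1 \phi^2(x-k)\,dx = \sum_{k \in \mathbb{Z}} \int_{-k}^{1-k} \phi^2(x)\,dx = \int_{\mathbb{R}} \phi^2(x)\,dx = 1. \tag{A.15}$$

□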
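As an illustration of Lemma A.2, the identity (A.11) can be evaluated for one concrete kernel. The sketch below assumes the Haar scaling function $\phi = \mathbf{1}_{[0,1)}$, for which $K(x,y) = \sum_k \phi(x-k)\phi(y-k)$ equals 1 when $\lfloor x \rfloor = \lfloor y \rfloor$ and 0 otherwise, with $A = 1$. The Haar choice, the midpoint grid and all function names are assumptions made for this example and are not taken from the paper. Coordinates are stored as integers in half-steps of the grid so that the floor comparisons are exact.

```python
def haar_kernel(x, y, unit):
    """Haar projection kernel: 1 if floor(x) == floor(y), else 0.

    x and y are integers measured in units of 1/unit, so floor is exact.
    """
    return 1 if x // unit == y // unit else 0

def lemma_a2_integral(n=8, A=1):
    """Midpoint-rule value of the four-fold integral in (A.11) for the Haar kernel."""
    unit = 2 * n          # the real number 1.0 corresponds to `unit` half-steps
    h = 1.0 / n           # grid step of the midpoint rule
    total = 0.0
    for i in range(n):                      # x midpoints in [0, 1]
        x = 2 * i + 1
        for m in range(4 * A * n):          # u midpoints in [-2A, 2A]
            u = -2 * A * unit + 2 * m + 1
            inner = 0.0
            for k in range(2 * A * n):      # z midpoints in [-A, A]
                z = -A * unit + 2 * k + 1
                inner += haar_kernel(x + z, x, unit) * haar_kernel(x + z, x - u, unit) * h
            # The w-integral has the same integrand as the z-integral,
            # so the product of the two inner integrals is inner squared.
            total += inner * inner * h * h
    return total

if __name__ == "__main__":
    print(lemma_a2_integral())
```

For this piecewise-constant kernel the midpoint rule is exact, so the computed value should agree with the right-hand side of (A.11) up to rounding.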



We now continue with the proof of Lemma 2.5. Since in Lemma A.1 the convergence
is uniform for $x \in [0,1]$, $u \in [-2A, 2A]$, $z \in [-A, A]$, $w \in [-A, A]$, then if $n$ is sufficiently
large, for fixed $M > 0$,

/M 
/ 2 (t)dt (A.16) 
-M 

The quantity in (1A.3|) is bounded in absolute value by 



/A PA rZA pi 
/ / / I(j n )dxdudzdw 
-A J -A J-2A JO 

/A pA p2A rl pM 
/ / / / f(t)dtdxdudzdw < oo 
■A J -A J-2A JO J-M 



for $n$ large. So, by Fubini, (A.3) is equal to

$$\int_{-A}^{A}\int_{-A}^{A}\int_{-2A}^{2A}\int_0^1 K(x+z, x) K(x+w, x) K(x+z, x-u) K(x+w, x-u)\, I(j_n)\,dx\,du\,dz\,dw. \tag{A.18}$$

By dominated convergence and Lemmas A.1 and A.2, it converges to $\int_{-M}^{M} f^2(y)\,dy$. □
Proof of Lemma 2.6. Choosing $M$ to be an integer such that $M > L + 2^{-j_n}(4A + 1)$,
we divide the plane $\mathbb{R}^2$ into four regions: $[-M,M]^2$, $[-M,M]^c \times [-M,M]^c$, $[-M,M] \times
[-M,M]^c$ and $[-M,M]^c \times [-M,M]$. To get the rate at which $2^{j_n}\int_{[-M,M]^2} C_n^2(s,t)\,ds\,dt$
tends to $\int_{-M}^{M} f^2(y)\,dy$, we estimate $I(j_n) - \int_{-M}^{M} f^2(y)\,dy$.

$I_1(j_n)$, which was defined in (A.4), can be decomposed into 4 terms as follows.

([2^L\-2A-l \2^L]+2A-1 2^M-2A-1 2^M+A \ 
E + E + E + E 2->7(2-^(x+o) 
i=2A [^"^-2^ i=r2^'"Ll+2yl i=2i«M-2A/ 

f(2- jn (u' + i))l(2- jn (z' + i) G [0,M])l(2- J '"(w' + i) G [0,M]) 

= : J^O'n) + ^sOn) + I'M + I'rtin)- 

(A.19) 

I'iijn) is essentially the same as h(j n ) m (1A.6|) . We follow the argument from (1A.6|) to 
(jA.9j) but consider the interval [0, L] instead. Due to the hypothesis of Holder continuity, 
there exists C depending on / and {j n }, such that 



M 



M 2 - m%\ < \Mij + rriijWMij - m i3 \ < C(Ay) a < Cn 



—5a 



So we obtain 



Si. 



f{y)dy 



< CLn 



- 5a 



(A.20) 



(A.21) 



Obviously, f 2 (0)Ay it i and f 2 (L)Ay iN . +3 are both bounded by Cn s . From flA.6j) . for all 
x G [0, 1], v! G [-2A, 2A + 1], z' G [-A, A + 1], w' G A + 1], 



/Kin) - / 



< 



4A + 



1 6A 

+ i .4^ 



i=2A 



Si, n - f 2 (0)A yi>1 - f 2 (L)Ay it 



Ni+3 



< Cn 



-5a 



f(y)dy 



(A.22) 



for $n$ large enough depending on $\{j_n\}$. Here $C$ depends on $f$ and $\{j_n\}$.

Next we will look at I' e (j n ) and consider a partition P, jn on [L, M\. Let £y := 2 _Jn (a; + 
i) + jAy, £^ = 2~ Jn (V + i) + j 'Ay- Similar to fl A. 6[) . but for a different we write, 



I'M = iZTI E E A v/(&)/(&)- 

i=\2in£}+2A 3=0 

Since / is bounded and monotonically decreasing on [L, oo), it follows that 

•M-Ay i>N+3 N r M-Ay i>N+3 -Ay 



(A.23) 



f(y)dy < E Ay/(&)/(&) < C7Ay + 



/ J (y)dy. (A.24) 



L+Ay iA 



Thus when M >L + 2~ jn (4A + 1), for all ac G [0, 1], it' G [-2 A, 2A + 1], z' e [-A, A + 1], 
u/ G [—-A, A + 1] and n large enough depending on {j n }, 



M 



f(y)dy 



< Cn~\ 



(A.25) 
where $C$ depends on $f$ and $\{j_n\}$. The two remaining sums in (A.19) are each bounded in absolute value by $Cn^{-\delta}$.
Collecting these bounds,



M 



h(j n ) - / f(y)dy 



< C(n- 5a + n- d ) < Cn 



-5a 



(A.26) 



Now it's easy to see 
we get 



Hjn) ~ J M P(y)dy < Cn- 5a . By (pOft . fCQj) and Lemma 



2 J " 

-M,A/] 

The derivation of a bound on 



C n (s, t)dsdt 



f(y)dy 



M 



< Cn 



- 5a 



(A.27) 



is 



I[-m,m] c J[-m,m] c ^n( s > t)dsdt — Jj_ MM ]c P{y)dy 
similar. The analysis of the key component is analogous to /^(jn), where the monotonicity 
of the tail of / is used. 



2*. 



-M,M] C J[-M,M] 



Cl(s,t)dsdt 



■M,M] C 



f(y)dy 



< Cn' 5 . 



(A.28) 



It is easier to analyze the integral on the regions $[-M,M] \times [-M,M]^c$ and $[-M,M]^c \times
[-M,M]$. Both are bounded by $Cn^{-\delta}$ since there are at most finitely many summands
that are not zero. (2.13) follows by collecting the bounds on the four regions and taking
$C$ sufficiently large so that it is true for all $n$.

□